-
Contrastive learning is a powerful framework for learning discriminative representations from image-text pairs. Despite its success, its theoretical foundations remain underexplored, especially when image-text pairs are misaligned. This paper provides the first theoretical analysis of contrastive learning under data misalignment, showing through an analysis of the training dynamics how ground-truth modality-paired features are amplified while spurious features are suppressed. Specifically, we study two nonlinear encoders trained jointly with a contrastive loss and demonstrate that noisy (or misaligned) data pairs result in mixed representations and degrade the model's generalization ability. In contrast, recaptioning and filtering improve data alignment, which in turn purifies the features learned by neurons and subsequently enhances generalization. Our analysis identifies feature purity as a key factor in the success of contrastive learning and offers insights into how data quality and training procedures impact representation learning and downstream generalization. The theoretical insights are supported by experiments on standard benchmarks.
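As a rough illustration of why misaligned pairs matter, the sketch below computes a symmetric contrastive loss over a batch of image and text embeddings. The abstract does not specify the loss, so the CLIP-style symmetric InfoNCE form, the `temperature` value, and all function names here are assumptions made for illustration only; the embeddings are toy vectors, not outputs of the encoders studied in the paper.

```python
import math
from typing import List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def normalize(v: Vector) -> Vector:
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def cross_entropy_diag(logits: List[List[float]]) -> float:
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(
    image_emb: List[Vector], text_emb: List[Vector], temperature: float = 0.1
) -> float:
    """Assumed CLIP-style loss: average of image-to-text and text-to-image terms."""
    img = [normalize(v) for v in image_emb]
    txt = [normalize(v) for v in text_emb]
    logits = [[dot(i, t) / temperature for t in txt] for i in img]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))


# Toy batch: pair i is (image i, text i). In the second call the captions are
# swapped, so every nominal "positive" pair is actually a mismatch.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned = symmetric_contrastive_loss(images, [[1.0, 0.0], [0.0, 1.0]])
misaligned = symmetric_contrastive_loss(images, [[0.0, 1.0], [1.0, 0.0]])
```

With fixed embeddings, the swapped captions give a much larger loss than the aligned ones. During training, that pressure would push the encoders to fit the wrong pairings, which is one informal way to picture the mixed representations described above.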
-
Conventional biological nitrogen removal (BNR) processes for mainstream municipal wastewater (MMW) treatment have high energy and chemical costs. Partial nitritation/anammox (PN/A) has the potential to reduce the carbon footprint of BNR; however, its implementation for MMW treatment has been limited by the low ammonium and high organic matter concentrations in MMW, which prevent suppression of nitrite-oxidizing bacteria (NOB) and heterotrophic denitrifiers. In this study, after organic carbon diversion, ammonium was separated from MMW in a novel bench-scale sequencing batch biofilm reactor (SBBR) containing chabazite, a natural zeolite mineral with a high ammonium ion exchange (IX) capacity. After breakthrough, chabazite was bioregenerated by PN/A biofilms. Recirculation was applied from the bottom to the top of the column to create an aerobic zone (top) for ammonia-oxidizing microorganisms (AOM) and an anoxic zone (bottom) for anammox bacteria. Rapid IX-PN/A SBBR startup was observed after inoculation with PN/A enrichments. The time required for bioregeneration decreased with increasing recirculation rate, with high total inorganic nitrogen (TIN) removal efficiency (81 %) and ammonium removal rate (0.11 g N/L/day) achieved at a recirculation velocity of 1.43 m/h. The core microbiome of the IX-PN/A SBBR contained a high abundance of bacteria of the phyla Pseudomonadota (15.27–20.62 %), Patescibacteria (12.38–20.05 %), Chloroflexota (9.36–14.23 %), and Planctomycetota (7.55–12.82 %), while quantitative PCR showed the highest ammonia monooxygenase (amoA, 2.0 × 10²) and anammox (amx, 1.0 × 10⁴) copy numbers in the top layers. The single-stage IX-PN/A SBBR achieved stable BNR for more than two years without chemical inputs, media replacement, or brine waste production.
-
Task arithmetic refers to editing a pre-trained model by adding a weighted sum of task vectors, each of which is the weight update from the pre-trained model to a model fine-tuned for a certain task. This approach has recently gained attention as a computationally efficient inference method for model editing, with applications such as multi-task learning, forgetting, and out-of-domain generalization. However, the theoretical understanding of why task vectors can execute various conceptual operations remains limited, due to the high non-convexity of training Transformer-based models. To the best of our knowledge, this paper provides the first theoretical characterization of the generalization guarantees of task vector methods on nonlinear Transformers. We consider a conceptual learning setting, where each task is a binary classification problem based on a discriminative pattern. We theoretically prove the effectiveness of task addition in simultaneously learning a set of irrelevant or aligned tasks, as well as the success of task negation in unlearning one task from irrelevant or contradictory tasks. Moreover, we prove that a proper selection of linear coefficients for task arithmetic achieves guaranteed generalization to out-of-domain tasks. All of our theoretical results hold for both dense-weight parameters and their low-rank approximations. Although established in a conceptual setting, our theoretical findings are validated on a practical machine unlearning task using the large language model Phi-1.5 (1.3B).
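The definition of task arithmetic above can be sketched in a few lines. This is a minimal illustration on toy weights stored as plain Python lists, not the paper's implementation; the names `task_vector` and `apply_task_arithmetic`, the dictionary layout, and the coefficient values are all hypothetical choices made for the example.

```python
from typing import Dict, List

# Hypothetical weight layout: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def task_vector(pretrained: Weights, finetuned: Weights) -> Weights:
    """Weight update from the pre-trained model to one fine-tuned model."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def apply_task_arithmetic(
    pretrained: Weights, vectors: List[Weights], coefficients: List[float]
) -> Weights:
    """Return pretrained + sum_i coefficients[i] * vectors[i]."""
    if len(vectors) != len(coefficients):
        raise ValueError("need exactly one coefficient per task vector")
    edited = {name: list(values) for name, values in pretrained.items()}
    for vec, lam in zip(vectors, coefficients):
        for name, delta in vec.items():
            edited[name] = [w + lam * d for w, d in zip(edited[name], delta)]
    return edited


# Toy weights for a pre-trained model and two fine-tuned variants.
pretrained = {"w": [1.0, 2.0]}
finetuned_a = {"w": [1.5, 2.0]}
finetuned_b = {"w": [1.0, 3.0]}

tau_a = task_vector(pretrained, finetuned_a)
tau_b = task_vector(pretrained, finetuned_b)

# Task addition: positive coefficients combine both tasks in one model.
added = apply_task_arithmetic(pretrained, [tau_a, tau_b], [1.0, 1.0])

# Task negation: a negative coefficient moves away from task A (unlearning).
negated = apply_task_arithmetic(pretrained, [tau_a], [-1.0])
```

No gradient steps are involved once the task vectors exist, which is why the approach is described as computationally efficient. Which coefficients actually give good generalization is the question the paper analyzes; the values 1.0 and -1.0 here are placeholders.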
-
Deep learning (DL) system research is often impeded by the limited availability and high cost of GPUs. In this paper, we introduce GPEmu, a GPU emulator for faster and cheaper prototyping and evaluation of deep learning system research without using real GPUs. GPEmu comes with four novel features: time emulation, memory emulation, distributed system support, and sharing support. We support over 30 DL models and 6 GPU models, the largest scale to date. We demonstrate the power of GPEmu by successfully reproducing the main results of nine recent publications and easily prototyping three new micro-optimizations.